Efficiency and Computational Limitations of Learning Algorithms
Authors
Abstract
This thesis presents new positive and negative results concerning the learnability of several well-studied function classes in the Probably Approximately Correct (PAC) model of learning.

Learning Disjunctive Normal Form (DNF) expressions in the PAC model is widely considered to be the main open problem in computational learning theory. We prove that PAC learning of DNF expressions by an algorithm that produces DNF expressions as its hypotheses is NP-hard. We show that the learning problem remains NP-hard even if the learning algorithm can ask membership queries. We also prove that, under an additional restriction on the size of hypotheses, learning remains NP-hard even with respect to the uniform distribution. These last two negative results are the first hardness results for learning in the PAC model with membership queries that are not based on cryptographic assumptions.

We complement these hardness results by presenting a new algorithm for learning DNF expressions with respect to the uniform distribution using membership queries. Our algorithm is attribute-efficient, noise-tolerant, and uses membership queries nonadaptively. In terms of running time, it substantially improves on the best previously known algorithm of Bshouty et al.

Learning parities with random noise with respect to the uniform distribution is a famous open problem in learning theory and is also equivalent to a major open problem in coding theory. We show that an efficient algorithm for this problem would imply efficient algorithms for several other key learning problems with respect to the uniform distribution. In particular, we show that agnostic learning of parities (also referred to as learning with adversarial noise) reduces to learning parities with random classification noise. Together with the parity learning algorithm of Blum et al., this gives the first non-trivial algorithm for agnostic learning of parities. This reduction also implies that learning of DNF expressions reduces to learning noisy parities on just a logarithmic number of variables.

A monomial is a conjunction of (possibly negated) Boolean variables and is one of the simplest and most fundamental concepts. We show that even weak agnostic learning of monomials by an algorithm that outputs a monomial is NP-hard, resolving a basic open problem in the model.

The proposed solutions rely heavily on tools from computational complexity and yield solutions to a number of problems outside of learning theory. Our hardness results are based on developing novel reductions from interactive proof systems for NP and from known NP-hard approximation problems. The reductions and learning algorithms with respect to the uniform distribution are based on new techniques for manipulating the Fourier transform of a Boolean function.
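To make the noisy-parity problem discussed above concrete, the following Python sketch illustrates the learning model: examples are drawn uniformly at random, labeled by a hidden parity, and each label is flipped independently with a fixed probability (random classification noise). This sketch is illustrative only and is not taken from the thesis; the names `noisy_parity_sample` and `brute_force_parity` are hypothetical. The brute-force learner simply tests every parity on at most k variables, which for k = O(log n) takes quasi-polynomial time n^{O(log n)}; finding a genuinely efficient algorithm for exactly this regime is the open problem that, via the reductions summarized above, would yield new algorithms for learning DNF expressions.

```python
import itertools
import random


def noisy_parity_sample(secret, n, noise_rate, rng):
    """Draw x uniformly from {0,1}^n; label it with the parity of the bits
    indexed by `secret`, then flip the label with probability `noise_rate`."""
    x = [rng.randint(0, 1) for _ in range(n)]
    label = sum(x[i] for i in secret) % 2
    if rng.random() < noise_rate:
        label ^= 1
    return x, label


def brute_force_parity(samples, n, k):
    """Exhaustively test every parity on at most k of the n variables and
    return the index set with the best empirical agreement. This baseline
    takes n^O(k) time; an efficient algorithm for k = O(log n) is open."""
    best_subset, best_agreement = (), -1
    for size in range(k + 1):
        for subset in itertools.combinations(range(n), size):
            agreement = sum(
                (sum(x[i] for i in subset) % 2) == y for x, y in samples
            )
            if agreement > best_agreement:
                best_subset, best_agreement = subset, agreement
    return best_subset


if __name__ == "__main__":
    rng = random.Random(0)
    n, secret, noise = 10, (1, 4, 7), 0.1
    samples = [noisy_parity_sample(secret, n, noise, rng) for _ in range(500)]
    # With enough samples the hidden support (1, 4, 7) is recovered w.h.p.
    print(brute_force_parity(samples, n, k=3))
```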
Similar Resources
Comparative Analysis of Machine Learning Algorithms with Optimization Purposes
The fields of optimization and machine learning are increasingly intertwined, and optimization in different problems leads to the use of machine learning approaches. Machine learning algorithms work in reasonable computational time for specific classes of problems and play an important role in extracting knowledge from large amounts of data. In this paper, a methodology has been employed to opt...
Lot Streaming in No-wait Multi Product Flowshop Considering Sequence Dependent Setup Times and Position Based Learning Factors
This paper considers a no-wait multi-product flowshop scheduling problem with sequence-dependent setup times. Lot streaming divides the lots of products into portions called sublots in order to reduce lead times and work-in-process and to increase machine utilization rates. The objective is to minimize the makespan. To clarify the system, a mathematical model of the problem is presented. Sin...
The machine learning process in applying spatial relations of residential plans based on samples and adjacency matrix
The world is moving towards a hardware and software presence of artificial intelligence in all fields of human work, and architecture is no exception. This research seeks to present a theoretical and practical model of intuitive design intelligence that poses the problem of learning layouts and spatial relationships to artificial intelligence algorithms; therefore, th...
Two Novel Learning Algorithms for CMAC Neural Network Based on Changeable Learning Rate
The Cerebellar Model Articulation Controller (CMAC) neural network is a computational model of the cerebellum which acts as a lookup table. The advantages of CMAC are fast learning convergence and the capability of mapping nonlinear functions, owing to its local generalization of weight updating, single structure, and easy processing. In the training phase, the disadvantage of some CMAC models is an unstable phenomenon...
Improved teaching–learning-based and JAYA optimization algorithms for solving flexible flow shop scheduling problems
The flexible flow shop (or hybrid flow shop) scheduling problem is an extension of the classical flow shop scheduling problem. In a simple flow shop configuration, a job having ‘g’ operations is performed on ‘g’ operation centres (stages), with each stage having only one machine. If any stage contains more than one machine to provide an alternate processing facility, then the problem...
متن کاملدستهبندی دادههای دوردهای با ابرمستطیل موازی محورهای مختصات
One of the core machine learning tasks is supervised learning, in which we infer a function from labeled training data. The goal of supervised learning algorithms is to learn a good hypothesis that minimizes the sum of the errors. A wide range of supervised algorithms is available, such as decision trees, SVM, and KNN methods. In this paper we focus on decision tree algorithms. When we ...